MLapSVM-LBS: Predicting DNA-binding proteins via a multiple Laplacian regularized support vector machine with local behavior similarity
Authors
Abstract
DNA-binding proteins (DBPs) are of great significance in many basic cellular processes. Experiment-based methods for identifying DBPs are costly and time-consuming. To deal with large-scale DBP identification tasks, a variety of computation-based methods have been developed. Inspired by previous work, we propose a multiple Laplacian regularized support vector machine with local behavior similarity (MLapSVM-LBS) to predict DBPs. We serially combine three features extracted from protein sequences (PsePSSM, GE, and NMBAC) and feed them into MLapSVM-LBS. Based on human learning theory, MLapSVM-LBS can better represent the relationship between samples through local behavior similarity. We introduce a new edge weight calculation method that takes label information into consideration. In addition, a distribution parameter reflecting the underlying probability distribution of a sample's neighborhood is also employed. To further improve the robustness of the model, we utilize multiple Laplacian regularization to build a multigraph model in which five graphs are constructed by changing the neighborhood size. To appraise the performance of our model, it is trained and tested on the PDB186, PDB1075, PDB2272, and PDB14189 datasets. On two independent testing sets (PDB186 and PDB2272), the model reaches accuracies of 0.887 and 0.712, respectively. The good results on both datasets demonstrate that MLapSVM-LBS is a reliable model.
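The multigraph idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives neither the label-aware edge weight formula nor the distribution parameter, so the code below substitutes a plain Gaussian-weighted k-nearest-neighbor graph and simply averages the Laplacians obtained from five neighborhood sizes. The function names, the neighborhood sizes `(3, 5, 7, 9, 11)`, the bandwidth `sigma`, and the uniform graph weights are all assumptions made for illustration.

```python
import numpy as np


def knn_graph_laplacian(X, k, sigma=1.0):
    """Unnormalized Laplacian L = D - W of a symmetrized k-NN graph.

    Edge weights are Gaussian in the Euclidean distance (an assumed,
    generic choice; not the label-aware weighting from the paper).
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbor

    # Indices of the k nearest neighbors of every sample.
    nn = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()

    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / (2.0 * sigma ** 2))
    W = np.maximum(W, W.T)  # symmetrize the adjacency matrix

    return np.diag(W.sum(axis=1)) - W


def multi_graph_laplacian(X, ks=(3, 5, 7, 9, 11), weights=None):
    """Weighted sum of k-NN graph Laplacians, one per neighborhood size."""
    ks = list(ks)
    if weights is None:
        # Assumed uniform graph weights; a real model might learn them.
        weights = np.full(len(ks), 1.0 / len(ks))
    weights = np.asarray(weights, dtype=float)
    return sum(w * knn_graph_laplacian(X, k) for w, k in zip(weights, ks))
```

In a Laplacian-regularized SVM, a matrix like this would enter the objective through a smoothness penalty of the form f^T L f, which is small when neighboring samples receive similar decision values. Each neighborhood size must be smaller than the number of samples.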
Similar resources
A New Formulation for Cost-Sensitive Two Group Support Vector Machine with Multiple Error Rate
Support vector machine (SVM) is a popular classification technique which classifies data using a max-margin separating hyperplane. The normal vector and bias of this hyperplane are determined by solving a quadratic model, which means that SVM training amounts to solving an optimization problem. Among the extensions of SVM, the cost-sensitive scheme refers to a model with multiple costs which conside...
The Doubly Regularized Support Vector Machine
The standard L2-norm support vector machine (SVM) is a widely used tool for classification problems. The L1-norm SVM is a variant of the standard L2-norm SVM that constrains the L1-norm of the fitted coefficients. Due to the nature of the L1-norm, the L1-norm SVM has the property of automatically selecting variables, which is not shared by the standard L2-norm SVM. It has been argued that the L1-norm SV...
Identification of DNA-Binding Proteins Using Support Vector Machine with Sequence Information
DNA-binding proteins are fundamentally important in understanding cellular processes. Thus, the identification of DNA-binding proteins has particularly important practical applications in various fields, such as drug design. We have proposed a novel approach for predicting DNA-binding proteins using only sequence information. The prediction model developed in this study is constructed...
Kernel-based machine learning protocol for predicting DNA-binding proteins
DNA-binding proteins (DNA-BPs) play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. Attempts have been made to identify DNA-BPs based on their sequence and structural information with moderate accuracy. Here we develop a machine learning protocol for the prediction of DNA-BPs where the classifier is Support Vector Machines ...
Support vector machines for predicting rRNA-, RNA-, and DNA-binding proteins from amino acid sequence.
Classification of gene function remains one of the most important and demanding tasks in the post-genome era. Most of the current predictive computer methods rely on comparing features that are essentially linear to the protein sequence. However, features of a protein nonlinear to the sequence may also be predictive to its function. Machine learning methods, for instance the Support Vector Mach...
Journal
Journal title: Knowledge Based Systems
Year: 2022
ISSN: 1872-7409, 0950-7051
DOI: https://doi.org/10.1016/j.knosys.2022.109174